Crime, public safety, and the quality of the living environment are important issues. By analyzing and visualizing Buffalo's crime data, this project gives a clearer view of the current situation in the city: which crime types are most common, on which dates and at what times they occur, and which neighborhoods see the most criminal activity. Beyond providing knowledge, the project also aims to help viewers stay alert and raise their vigilance.
By working with spatial attributes, this project focuses on building customized analytical modules for processing and analyzing geospatial data. The goal is to show where crimes occur in Buffalo through geospatial mapping: 2D maps, interactive point frequency maps, and interactive point distribution maps.
Source: Buffalo Open Data - Crime Incidents
This dataset contains information about crime incidents in Buffalo.
The dataset was created on September 6, 2017 and was updated on May 22, 2022.
There are 281,601 records in total and 29 attribute fields.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import math
As mentioned, there are 29 columns. Only the 7 columns needed for this project are read.
# Read the dataset from url, add ?$limit=300000 to read all records
crime_url = 'https://data.buffalony.gov/resource/d6g9-xbgu.csv?$limit=300000'
crime = pd.read_csv(crime_url, usecols=['case_number','incident_datetime','parent_incident_type','hour_of_day','day_of_week',
'address_1','neighborhood_1'])
crime.tail()
crime.shape
crime.dtypes
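The `$limit` in the URL above is a query parameter of the Socrata (SODA) API that serves this dataset. The sketch below builds the same kind of URL with the standard library. The `$select` parameter is an assumption on my part: it is a standard SODA parameter that asks the server to return only the listed columns, offered here as a possible alternative to `usecols`, and it is not what the notebook does. The helper name `build_query_url` is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://data.buffalony.gov/resource/d6g9-xbgu.csv"
COLUMNS = ["case_number", "incident_datetime", "parent_incident_type",
           "hour_of_day", "day_of_week", "address_1", "neighborhood_1"]

def build_query_url(base_url, limit, columns=None):
    """Build a SODA query URL (hypothetical helper, not part of the notebook)."""
    params = {"$limit": limit}
    if columns:
        # Assumption: ask the server for just these columns via $select.
        params["$select"] = ",".join(columns)
    # Keep '$' and ',' unescaped so the URL stays readable.
    return base_url + "?" + urlencode(params, safe="$,")

url = build_query_url(BASE_URL, 300000, COLUMNS)
print(url)
```

The resulting string could be passed to `pd.read_csv` in the same way as `crime_url`.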
Limitation of the dataset: it lacks numerical data.
The only useful numerical field that can be combined with other data is Hour of Day.
Ideally, crime data would also contain information such as the number of people injured or killed.
crime['parent_incident_type'].value_counts()
crime['day_of_week'].value_counts()
crime['day_of_week'].unique()
crime['neighborhood_1'].value_counts()
crime['hour_of_day'].describe()
crime['hour_of_day'].unique()
# Year from datetime column
crime['incident_datetime'].str[0:4].value_counts()
The dataset was created in 2017. The years with the highest and second-highest number of crime cases are 2007 and 2009. This is the period of the Great Recession, a crisis that spread across many financial sectors such as credit, insurance, and securities.
# Month from datetime column
crime['incident_datetime'].str[5:7].value_counts()
The number of cases drops with the temperature. July, August, and September are warm months, while February and March are colder months with high average snowfall. Buffalo's cold, snowy weather affects the number of crime cases: crimes happen less frequently in cold months than in warm months.
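The year and month counts above rely on slicing fixed character positions out of the timestamp string. As a minimal sketch, the same monthly tally can be done by parsing the timestamps. This assumes the strings are ISO 8601 formatted, as the slices `[0:4]` and `[5:7]` suggest. The sample values are made up for illustration and are not taken from the dataset.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample timestamps, invented to illustrate the format only.
samples = [
    "2021-07-04T23:15:00.000",
    "2021-07-19T08:00:00.000",
    "2022-02-11T14:30:00.000",
]

def month_counts(timestamps):
    """Count incidents per calendar month (1-12) by parsing each timestamp."""
    counts = Counter()
    for ts in timestamps:
        counts[datetime.fromisoformat(ts).month] += 1
    return counts

by_month = month_counts(samples)               # parsed: integer months
by_slice = Counter(ts[5:7] for ts in samples)  # sliced: two-character strings
print(by_month, by_slice)
```

Parsing fails loudly on a malformed value, whereas slicing silently returns whatever characters sit at those positions.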
# number of missing values in each columns
crime.isnull().sum()
Out of 279,677 cases in total:
5 cases are missing Incident Datetime information.
39 cases are missing address information.
1,024 cases are missing neighborhood information.
# Cases that do not have DateTime information
crime[crime['incident_datetime'].isnull()]
pd.set_option('display.max_rows',500)
crime.groupby(['neighborhood_1','parent_incident_type']).size()
crime.dropna(how='any',inplace=True)
crime.shape
crime['day_of_week']=crime['day_of_week'].str.upper()
crime['day_of_week'].value_counts()
len(crime['parent_incident_type'])
plt.figure(figsize=(12,5))
chart = sns.countplot(y='parent_incident_type', data=crime,palette='Spectral')
# set name for the plot
chart.set_title(f'{chart.get_ylabel().capitalize()}',fontweight='bold')
# add percentages for each bar
for p in chart.patches:
    chart.text(p.get_width(), p.get_y()+0.5, '{:1.2f}%'.format(p.get_width()*100/float(len(crime['parent_incident_type']))), ha='left')
plt.show()
43.62% - Almost half of the recorded crime incidents in Buffalo are theft cases.
Top crime incidents are theft, assault, and breaking and entering.
# create function to draw multiple countplots
def plot_multiple_countplots(crime, cols, num_cols, num_rows, hue=None):
    fig, axs = plt.subplots(num_rows, num_cols, figsize=(20, 10))
    for index, col in enumerate(cols):
        i = math.floor(index/num_cols)
        j = index - i*num_cols
        if num_rows == 1:
            if num_cols == 1:
                chart = sns.countplot(x=crime[col], ax=axs, hue=hue, palette='Spectral')
            else:
                chart = sns.countplot(x=crime[col], ax=axs[j], hue=hue, palette='Spectral')
        else:
            chart = sns.countplot(x=crime[col], ax=axs[i, j], hue=hue, palette='Spectral')
        # rotate axis labels
        chart.set_xticklabels(chart.get_xticklabels(), rotation=15, ha='center')
        # set the title of each countplot
        chart.set_title(f'{chart.get_xlabel().capitalize()}', fontweight='bold')
        # add percentages on top of each bar
        for p in chart.patches:
            chart.text(p.get_x(), p.get_height()+1, '{:1.2f}%'.format(p.get_height()*100/float(len(crime[col]))), ha='left')
plot_multiple_countplots(crime, ['day_of_week','hour_of_day'],2,1)
1. Day of week:
2. Hour of Day:
plt.figure(figsize=(17,10))
chart = sns.boxplot(x='parent_incident_type', y='hour_of_day',data=crime, hue ='day_of_week' , palette='Spectral')
chart.set_title(f'Timeline of Incident Types',fontweight='bold')
plt.show()
Most property-related crimes, such as Theft, Theft of Vehicle, and Breaking & Entering, happen between around 8 a.m. and 4 p.m., the time frame of working hours.
Crimes arising from interpersonal conflict, such as Assault, Robbery, Sexual Assault, and Homicide, have a more variable time frame.
crime['Weekend'] = crime['day_of_week'].isin(['SATURDAY', 'SUNDAY'])
ax=sns.catplot(x='parent_incident_type', y='hour_of_day', hue='Weekend', kind='box', dodge=False, data=crime)
ax.fig.suptitle(f'Crime on Weekend',fontweight='bold')
ax.fig.set_size_inches(17,5)
x = [0,1,2,3,4,5,6,20,21,22,23]
crime['Night Time'] = crime['hour_of_day'].isin(x)
plt.figure(figsize=(12,7))
chart = sns.countplot(y='parent_incident_type', data=crime,hue='Night Time')
# set name for the plot
chart.set_title(f'Crime at Night',fontweight='bold')
# add percentages for each bar
for p in chart.patches:
    chart.text(p.get_width(), p.get_y()+0.25, '{:1.2f}%'.format(p.get_width()*100/float(len(crime['parent_incident_type']))), ha='left')
plt.show()
Night time in this project runs from 8 p.m. to 6 a.m. (hours 20-23 and 0-6 of `hour_of_day`).
Only Theft, Breaking & Entering, and Sexual Offense occur more often during the day. Daytime, especially from 8 a.m. to 4 p.m., covers office working hours: people leaving home for work and staying at the office create good conditions for thieves and intruders.
All other types of crime incident occur more often at night.
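The night-time definition above can be sketched as a small predicate. The first version mirrors the explicit hour list used in the notebook. The second is an alternative I am assuming is equivalent, written as a range that wraps past midnight. Both function names are hypothetical.

```python
# Hours counted as night in the notebook: 20-23 and 0-6.
NIGHT_HOURS = frozenset([0, 1, 2, 3, 4, 5, 6, 20, 21, 22, 23])

def is_night(hour):
    """Membership test against the explicit list of night hours."""
    return hour in NIGHT_HOURS

def is_night_by_range(hour, start=20, end=6):
    """Same rule as a wrap-around range: from `start` through `end`, crossing midnight."""
    return hour >= start or hour <= end

for h in (6, 7, 19, 20):
    print(h, is_night(h))
```

Because hour 6 is included, an incident logged at 6:45 a.m. still counts as night under this definition.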
plt.figure(figsize=(17,10))
chart = sns.countplot(y='neighborhood_1', data=crime,palette='Spectral')
# set name for the plot
chart.set_title(f'Neighborhood and Crime Cases',fontweight='bold')
for p in chart.patches:
    chart.text(p.get_width(), p.get_y()+0.5, '{:1.2f}%'.format(p.get_width()*100/float(len(crime['neighborhood_1']))), ha='left')
plt.show()
High frequency of crime - Dangerous Neighborhoods:
Broadway Fillmore
Central
Kensington-Bailey
North Park
Genesee-Moselle
Low frequency of crime - Safe Neighborhoods:
First Ward
Seneca Babcock
Central Park
Kaisertown
Ellicott
plt.figure(figsize=(17,10))
chart = sns.countplot(y='neighborhood_1', data=crime, hue= 'parent_incident_type',palette='Spectral')
# set name for the plot
chart.set_title(f'Neighborhood with Incident Type',fontweight='bold')
plt.show()
North Park is the neighborhood with the highest frequency of incidents, and most of its cases are theft.
Neighborhoods that suffer most from Theft: North Park, Broadway Fillmore, Central, Kensington-Bailey, Elmwood Bidwell and Elmwood Bryant.
Because 43.62% of recorded cases are theft cases, this step removes all theft cases to take a closer look at the other incident types in different neighborhoods.
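One detail of this step is worth illustrating: the filter matches on the substring 'Theft', so it removes more than the single 'Theft' category. The sketch below uses plain Python lists and a few category names that appear elsewhere in this notebook. The exact-match variant is an alternative I am suggesting, not what the notebook does.

```python
# A few incident types that appear in this notebook.
types = ["Theft", "Theft of Vehicle", "Assault", "Robbery"]

# Substring match: drops every type whose name contains 'Theft'.
substring_kept = [t for t in types if "Theft" not in t]

# Exact match (alternative): drops only the 'Theft' category itself.
exact_kept = [t for t in types if t != "Theft"]

print(substring_kept)
print(exact_kept)
```

Which variant is appropriate depends on whether 'Theft of Vehicle' should remain in the non-theft comparison.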
# Remove all Theft cases (the substring match also drops 'Theft of Vehicle')
crime2 = crime[~crime['parent_incident_type'].str.contains('Theft')]
# Draw chart
plt.figure(figsize=(17,10))
chart = sns.countplot(y='neighborhood_1', data=crime2, hue= 'parent_incident_type',palette='tab10')
# set name for the plot
chart.set_title(f'Neighborhood with Non-Theft Incident Type',fontweight='bold')
plt.show()
Without theft cases, neighborhoods suffer mostly from Assault and Breaking & Entering.
High frequency of Assault: Broadway Fillmore, Genesee-Moselle, Schiller Park, Central, Kenfield and Delavan Grider.
High frequency of Breaking & Entering: Broadway Fillmore, Genesee-Moselle, Schiller Park, Kensington-Bailey, University Heights.
High frequency of Robbery: Broadway Fillmore, Genesee-Moselle, and University Heights.
Without theft cases, North Park is no longer the most dangerous neighborhood.
plt.figure(figsize=(17,10))
chart = sns.countplot(y='neighborhood_1', hue = 'day_of_week',data=crime,palette='Spectral')
# set name for the plot
chart.set_title(f'Day of Crime in Neighborhood',fontweight='bold')
plt.show()
The Central neighborhood is more dangerous on weekends.
Alert on all days of the week: Broadway Fillmore, North Park, Kensington-Bailey, Schiller Park, Genesee-Moselle and Elmwood Bidwell.
crime.duplicated(subset=['address_1'],keep='first').sum()
=> 92.25% of records are repeat occurrences at an address that already appears in the data, i.e. they fall at addresses with 2 or more recorded cases.
crime.loc[crime.duplicated(subset=['address_1'], keep='first'),:]
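The count returned by `duplicated(..., keep='first')` is a count of records, not of addresses, so it helps to keep two different ratios apart. The sketch below computes both on a tiny, made-up list of addresses. The function names are hypothetical and the data is purely illustrative.

```python
from collections import Counter

def repeat_record_share(addresses):
    """Share of records whose address already appeared earlier in the list.
    This is the quantity counted by duplicated(keep='first')."""
    seen = set()
    repeats = 0
    for addr in addresses:
        if addr in seen:
            repeats += 1
        else:
            seen.add(addr)
    return repeats / len(addresses)

def multi_case_address_share(addresses):
    """Share of distinct addresses that have two or more records."""
    counts = Counter(addresses)
    return sum(1 for c in counts.values() if c >= 2) / len(counts)

# Hypothetical data: one busy address and two single-incident addresses.
addresses = ["A", "A", "A", "A", "B", "C"]
print(repeat_record_share(addresses), multi_case_address_share(addresses))
```

In this toy case half of the records are repeats, yet only one address in three has more than one case, so the two figures should not be used interchangeably.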
%%time
!apt install gdal-bin python-gdal python3-gdal
# Install rtree - Geopandas requirement
!apt install python3-rtree
# Install Geopandas (GitHub no longer serves the unauthenticated git:// protocol, so use https)
!pip install git+https://github.com/geopandas/geopandas.git
# Install descartes - Geopandas requirement
!pip install descartes
!pip install geopandas
import geopandas as gpd
pd.set_option('display.max_columns',None)
# Add $limit=300000 to read in all records; the default is 1000 records.
crime_url = "https://data.buffalony.gov/resource/d6g9-xbgu.geojson?$limit=300000"
crime_gdf = gpd.read_file(crime_url)
#crime_gdf = gpd.read_file(crime_url, ignore_fields=["iso_a3", "gdp_md_est"])
crime_gdf.tail()
!pip install contextily
import contextily as ctx
%matplotlib inline
crime_gdf.drop(['incident_id','updated_at'], axis=1,inplace=True)
crime_gdf.head()
A coordinate reference system (CRS) defines how the two-dimensional, projected map in a geographic information system (GIS) relates to real places on the earth.
Check the CRS and change it to EPSG:3857 so that the points can be drawn over the basemap tiles.
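To make the CRS change concrete, here is a minimal sketch of the spherical Web Mercator formula behind EPSG:3857, written in plain Python. It only illustrates what a reprojection from longitude/latitude does to a single point and is not how GeoPandas performs the transformation internally. The function name is hypothetical and the Buffalo coordinates are approximate.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator (WGS 84 semi-major axis)

def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to Web Mercator x/y in meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

# Approximate coordinates of downtown Buffalo (assumption, for illustration only).
x, y = lonlat_to_web_mercator(-78.8784, 42.8864)
print(round(x), round(y))
```

After the conversion, coordinates are in meters rather than degrees, which is why the x and y values seen later in this notebook are in the millions.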
# Check crs
crime_gdf.crs
# Change crs
crime_gdf.to_crs('epsg:3857',inplace=True)
crime_gdf.shape
orig_rows = crime_gdf.shape[0]
crime_gdf = crime_gdf.loc[crime_gdf.geometry.notnull()]
print(f'Records with missing location information = {orig_rows-crime_gdf.shape[0]}')
#crime_gdf.geometry=crime_gdf.geometry.astype(float)
crime_gdf.dropna(subset =['geometry'], how='any',inplace=True)
#crime_gdf.dropna( how='any',inplace=True)
crime_gdf.shape
Delete the 3,959 records that are missing location information, because they cannot be shown on the map.
fig, ax = plt.subplots(figsize=(15,15), subplot_kw=dict(aspect='equal'))
crime_gdf.plot(column=crime_gdf['neighborhood_1'], ax=ax);
ax.set_title('Crime Incident Locations of Buffalo by Neighborhood',fontdict={'fontsize': '25', 'fontweight' : '3'})
ax.set_axis_off()
ctx.add_basemap(ax)
Some records have bad geometry, with locations outside New York State.
As a result, the map extends far beyond the Buffalo area.
#crime_gdf = crime_gdf.GeoDataFrame.drop(columns=['incident_id'], axis=1, inplace=True)
crime_gdf.council_district.unique()
The dataset contains an 'UNKNOWN' council district, which causes the mapping problem above. The fix is to remove the records whose council_district is UNKNOWN.
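Removing the UNKNOWN district works here because the badly geocoded records happen to carry that label. A more direct alternative, sketched below under my own assumptions, is to keep only points inside a bounding box around the city. The box values are rough, illustrative longitude/latitude limits, not an official boundary, and the function name is hypothetical.

```python
# Rough bounding box around Buffalo as (min_lon, min_lat, max_lon, max_lat).
# Assumption: approximate values chosen for illustration only.
BUFFALO_BBOX = (-79.0, 42.8, -78.7, 43.0)

def in_bbox(lon, lat, bbox=BUFFALO_BBOX):
    """Return True if the point lies inside the bounding box (edges included)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat

# Hypothetical points: one inside the city, one badly geocoded at (0, 0).
points = [(-78.88, 42.89), (0.0, 0.0)]
kept = [p for p in points if in_bbox(*p)]
print(kept)
```

The box is expressed in degrees, so a filter like this would apply to the data before it is reprojected to EPSG:3857. GeoPandas also provides a coordinate-based indexer, `.cx`, that could serve the same purpose.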
# set council_district as index of the dataframe
crime_gdf.set_index('council_district',inplace=True)
crime_gdf.head()
crime_gdf.drop(['UNKNOWN'] , axis=0,inplace=True)
fig, ax = plt.subplots(figsize=(15,15), subplot_kw=dict(aspect='equal'))
crime_gdf.plot(column=crime_gdf['neighborhood_1'], ax=ax);
ax.set_title('Crime Incident Locations of Buffalo by Neighborhood',fontdict={'fontsize': '25', 'fontweight' : '3'})
ax.set_axis_off()
ctx.add_basemap(ax)
After deleting the UNKNOWN council district records, the map shows all cases within the Buffalo area. Based on the map, almost every place in Buffalo has a record of crime incidents. Only the South and Delaware council districts show some blank areas with no crime records. These areas are parks.
crime_gdf.parent_incident_type.unique()
There are 9 types of crime in the dataset. This part draws a plot that contrasts 2 crime types: Assault and Homicide.
crime_gdf.reset_index(inplace=True)
crime_gdf['conrank'] = 'lightgray'
#crime_gdf.loc[crime_gdf.parent_incident_type == 'Theft','conrank']='red'
crime_gdf.loc[crime_gdf.parent_incident_type == 'Assault','conrank']='blue'
#crime_gdf.loc[crime_gdf.parent_incident_type == 'Robbery','conrank']='purple'
#crime_gdf.loc[crime_gdf.parent_incident_type == 'Theft of Vehicle','conrank']='organce'
#crime_gdf.loc[crime_gdf.parent_incident_type == 'Breaking & Entering','conrank']='yellow'
#crime_gdf.loc[crime_gdf.parent_incident_type == 'Sexual Offense','conrank']='violet'
#crime_gdf.loc[crime_gdf.parent_incident_type == 'Other Sexual Offense','conrank']='brown'
#crime_gdf.loc[crime_gdf.parent_incident_type == 'Sexual Assault','conrank']='lime'
crime_gdf.loc[crime_gdf.parent_incident_type == 'Homicide','conrank']='deepPink'
crime_gdf.loc[~crime_gdf.parent_incident_type.isin(['Homicide','Assault']),'conrank']='gray'
import matplotlib.lines as mlines
fig, ax = plt.subplots(figsize=(12,12), subplot_kw=dict(aspect='equal'))
deepPink_marker = mlines.Line2D([], [], color='deepPink', marker='.', linestyle='None',
markersize=10,label='Homicide')
blue_marker = mlines.Line2D([], [], color='blue', marker='.', linestyle='None',
markersize=10,label='Assault')
gray_marker=mlines.Line2D([], [], color='gray', marker='.', linestyle='None',
markersize=10, label='Other types')
ax.legend(handles=[deepPink_marker,blue_marker,gray_marker])
crime_gdf.plot(color=crime_gdf['conrank'], ax=ax)
ax.set_title('Buffalo Assault and Homicide Crime Cases',fontdict={'fontsize': '25', 'fontweight' : '3'})
ax.set_axis_off()
ctx.add_basemap(ax)
This map shows the difference in location and number of cases between Assault and Homicide. Assault is the second most common type of crime in Buffalo, and Assault incidents occur far more often than Homicide incidents.
Although the number of cases differs, both Assault and Homicide cases are scattered all around Buffalo.
Many locations have more than 1 recorded crime case. This part shows the duplicated addresses in the dataset.
# The total number of duplicated addresses here is smaller than above because rows with missing geometry were removed
crime_gdf.duplicated(subset=['address_1'],keep='first').sum()
fig, ax = plt.subplots(figsize=(15,15), subplot_kw=dict(aspect='equal'))
crime_gdf.plot(column=crime_gdf.duplicated(subset=['address_1'],keep='first'), ax=ax);
ax.set_title('>= 2 Crime Incidents Cases Locations of Buffalo',fontdict={'fontsize': '25', 'fontweight' : '3'})
ax.set_axis_off()
ctx.add_basemap(ax)
Yellow dots mark records whose address already appeared earlier in the dataset, i.e. locations with 2 or more crime cases. Dark dots mark the first record at each address. The two categories differ sharply in size: more than 92% of records fall at addresses with 2 or more recorded cases.
Point locations represent where the actual event occurred. This approach is only viable if there are point locations with multiple occurrences of the geographic event under consideration.
from bokeh.tile_providers import CARTODBPOSITRON, get_provider
tileProvider = get_provider('CARTODBPOSITRON_RETINA')
from bokeh.io import output_notebook, show, output_file, save
from bokeh.plotting import figure
from bokeh.models import HoverTool, GeoJSONDataSource
from bokeh.layouts import row,column
from bokeh.models.widgets import Div
output_notebook()
TOOLS = "pan,wheel_zoom,box_zoom,reset,save"
kwargs = {"plot_width":800,
"plot_height":700,
"sizing_mode":'scale_both',
"outline_line_color":'#046626',
"outline_line_width":3,
"outline_line_alpha":.3,
'toolbar_location':'above',
'border_fill_color':'#4287f5',
'border_fill_alpha':.3,
'min_border_left': 20,
'min_border_right':20,
'min_border_top': 10,
'min_border_bottom':20}
# Check null geometry
orig_rows = crime_gdf.shape[0]
crime_gdf = crime_gdf.loc[crime_gdf.geometry.notnull()]
print(f'Records with missing location information = {orig_rows-crime_gdf.shape[0]:,.0f}\n\
Percent missing = {((orig_rows-crime_gdf.shape[0])/orig_rows)*100:,.0f}%')
This key combines the x and y coordinates of each point, so that locations where crimes happened more than once can be counted.
crime_gdf['newLoc'] = crime_gdf.geometry.x.astype(str)+ crime_gdf.geometry.y.astype(str)
numlocs = crime_gdf.newLoc.value_counts().rename_axis('uniquepts').to_frame('counts')
numlocs.head()
At some locations, crime incidents occurred at a very high rate. For example, the single location with key -8773756.9863626515302843.689697811 had 1033 crime cases!
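The location key above is built by gluing the x and y strings together and then counting how often each key occurs. The sketch below reproduces that idea in plain Python and, as an alternative I am assuming would work equally well, counts on `(x, y)` tuples instead. The coordinates are made up for illustration.

```python
from collections import Counter

# Hypothetical projected coordinates; the first two points coincide.
points = [
    (-8773756.98, 5302843.68),
    (-8773756.98, 5302843.68),
    (-8780000.0, 5300000.0),
]

# Notebook-style key: x and y joined as strings without a separator.
string_keys = Counter(str(x) + str(y) for x, y in points)

# Alternative: use the coordinate pair itself as the key.
tuple_keys = Counter(points)

most_common_location, n_cases = tuple_keys.most_common(1)[0]
print(most_common_location, n_cases)

# Without a separator, different pairs can in principle produce the same string.
collision = (str(1.5) + str(12.3)) == (str(1.51) + str(2.3))
print(collision)
```

A collision like the last line is unlikely with real projected coordinates of similar magnitude, but a tuple key or a separator removes the possibility entirely and keeps the key easy to split back into x and y.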
crime_gdf.geometry.value_counts().sum()
# Remove duplicate
uHl = crime_gdf.drop_duplicates(subset='newLoc').reset_index()
uHl.tail()
uHl.parent_incident_type.unique()
allHl = pd.merge(uHl,numlocs,left_on='newLoc',right_on='uniquepts').drop(['newLoc'],axis=1)
print(f'Number of locations: {allHl.shape[0]}\n\
accounting for {allHl.counts.sum()} cases of crime incidents in Buffalo')
The next maps look at the locations of theft cases, the most frequent crime type, and homicide cases, the most dangerous crime type.
# Theft cases
theftcases = allHl.loc[allHl.parent_incident_type =='Theft'].copy()
# Report the case total for this subset rather than for all locations
print(f'Number of Locations for Theft cases: {theftcases.shape[0]}\n\
Accounting for {theftcases.counts.sum()} cases of crime incidents in Buffalo')
# Homicide cases
homicidecases = allHl.loc[allHl.parent_incident_type =='Homicide'].copy()
print(f'Number of Locations for Homicide cases: {homicidecases.shape[0]}\n\
Accounting for {homicidecases.counts.sum()} cases of crime incidents in Buffalo')
maxcir = 60
maxcnt = theftcases.counts.max()
theftcases['radius']=(theftcases.counts/maxcnt*maxcir)
theftcases['radius']=theftcases['radius'].astype(float).round().astype(int)
theftcases.head()
maxcir = 60
maxcnt = homicidecases.counts.max()
homicidecases['radius']=(homicidecases.counts/maxcnt*maxcir)
homicidecases['radius']=homicidecases['radius'].astype(float).round().astype(int)
homicidecases.head()
theftcases.to_crs('epsg:3857',inplace=True)
homicidecases.to_crs('epsg:3857',inplace=True)
output_file("/content/CrimePointFrequencyMaps.html",
title="Locations with Frequency Crime Incidents in Buffalo")
f1 = figure(title = "Location of Theft cases in Buffalo", tools=TOOLS, toolbar_sticky=False,**kwargs)
f2 = figure(title = "Location of Homicide cases in Buffalo", tools=TOOLS, toolbar_sticky=False,**kwargs,
x_range=f1.x_range,y_range=f1.y_range)
f1.add_tile(tileProvider)
f1.title.text_font_style = 'italic'
f1.title.text_font_size = '14pt'
f1.axis.visible=False
f2.add_tile(tileProvider)
f2.title.text_font_style = 'italic'
f2.title.text_font_size = '14pt'
f2.axis.visible=False
point_source_1 = GeoJSONDataSource(geojson=theftcases.to_json())
point_source_2 = GeoJSONDataSource(geojson=homicidecases.to_json())
Circle1=f1.circle('x','y',size='radius',fill_color='blue',line_color='blue',fill_alpha=0.5,source=point_source_1)
Circle2=f2.circle('x','y',size='radius',fill_color='red',line_color='red',fill_alpha=0.5,source=point_source_2)
c_hover= HoverTool(renderers=[Circle1])
c_hover.point_policy = "follow_mouse"
c_hover.tooltips=[("Address","@address_1," "@council_district"),
(" " , " "),
("Number of Cases","@counts")]
f1.add_tools(c_hover)
c2_hover= HoverTool(renderers=[Circle2])
c2_hover.point_policy = "follow_mouse"
c2_hover.tooltips=[("Address","@address_1," "@council_district"),
(" " , " "),
("Number of Cases","@counts")]
f2.add_tools(c2_hover)
heading = Div(text="""<h1>Point Frequency Maps</h1>\
<p> The two maps below show locations and frequencies of theft and homicide crime cases in Buffalo.\
 On the left, proportional point symbols show locations of theft cases and on the right are locations of homicide cases.</p>\
<p> Use the tools above each map to pan, zoom, etc. \
Hover over a point to see the address and number of cases.</p> \
<p><b><i>Data Source</i></b> = <a href='https://data.buffalony.gov' target='_blank'>Buffalo Open Data</a>.</p>\
<p style="font-size:9px;">Maps created 4/10/2022 by Nguyet Que T. Tran.</p>""", sizing_mode="stretch_both")
layout = column(heading, row(f1,f2),sizing_mode='stretch_both',margin=(5,5,5,5))
show(layout)
The maps show the locations where crime incidents occurred. Each point is geocoded to the actual location of an address, house, or store.
The size of the symbol at each point represents the number of crimes that happened at that location: the more cases, the larger the circle.
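The symbol sizing described above is a linear scaling of each location's count against the largest count, rounded to a whole number. A minimal sketch follows. The `min_radius` parameter is my own hypothetical addition and is not in the notebook. The sample counts are illustrative, apart from 1033, which is the largest count reported earlier.

```python
def scale_radius(count, max_count, max_radius=60, min_radius=0):
    """Scale a count linearly to a symbol size, rounded to an integer.
    min_radius is a hypothetical floor that keeps small counts visible."""
    radius = round(count / max_count * max_radius)
    return max(radius, min_radius)

counts = [1, 5, 100, 1033]
radii = [scale_radius(c, max(counts)) for c in counts]
radii_with_floor = [scale_radius(c, max(counts), min_radius=2) for c in counts]
print(radii)
print(radii_with_floor)
```

With one very large maximum, linear scaling rounds many small counts down to zero, so those locations would not be drawn. A small floor, as in the second list, is one possible way to keep them on the map.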
# Buffalo Council Districts dataset
api_url="https://data.buffalony.gov/resource/u5mx-ugvy.geojson"
cd_gdf=gpd.read_file(api_url)
cd_gdf.tail()
crime_gdf = crime_gdf.to_crs('epsg:3857')
cd_gdf = cd_gdf.to_crs('epsg:3857')
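# 'predicate' replaces the older 'op' keyword, which newer GeoPandas releases no longer accept
joindf = gpd.sjoin(crime_gdf,cd_gdf,how='inner',predicate='intersects')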
joindf.tail()
joindf['council_district']=joindf.council_district.astype(str)
ct = joindf.copy()
ct = ct.council_district.groupby(joindf['council_district']).count().sort_values(ascending=False)
ctdf=ct.to_frame(name='counts').reset_index()
ctdf.tail()
nCases = pd.merge(cd_gdf,ctdf,left_on="dist_name",right_on="council_district")
nCases['centroids'] =nCases['geometry'].centroid
nCases = nCases.set_geometry('centroids')
maxcir = 60
maxcnt = nCases.counts.max()
nCases['radius']=(nCases.counts/maxcnt*maxcir)
nCases['radius']=nCases['radius'].astype(float).round().astype(int)
nCases.head()
output_file("/content/CrimeDistributionMaps.html",
title="Crime Incidents by Council Districts in Buffalo")
f1 = figure(title = "Crime incident cases in Buffalo by Council Districts", tools=TOOLS, toolbar_sticky=False,**kwargs)
f1.add_tile(tileProvider)
f1.title.text_font_style = 'italic'
f1.title.text_font_size = '14pt'
f1.axis.visible=False
TA20 = nCases.drop('geometry',axis=1).copy()
point_source_1 = GeoJSONDataSource(geojson=TA20.to_json())
poly_source = GeoJSONDataSource(geojson=cd_gdf.to_json())
Circle1=f1.circle('x','y',size='radius',fill_color='blue',line_color='blue',fill_alpha=0.5,source=point_source_1)
areas = f1.patches('xs','ys',source=poly_source,name="Council Districts",fill_color=None,fill_alpha=0.6,line_color="black",line_width=0.5)
c_hover= HoverTool(renderers=[Circle1])
c_hover.point_policy = "follow_mouse"
c_hover.tooltips=[
("Council Districts","@dist_name"),
("Number of Cases","@counts")]
f1.add_tools(c_hover)
heading = Div(text="""<h1>Point Distribution Map</h1>\
<p> The map below shows the locations and distribution of crime incident cases in Buffalo.</p>\
<p> Use the tools above the map to pan, zoom, etc. \
Hover over a circle to see the council district and number of cases.</p> \
<p><b><i>Data Source</i></b> = <a href='https://data.buffalony.gov' target='_blank'>Buffalo Open Data</a>.</p>\
<p style="font-size:9px;">Maps created 4/10/2022 by Nguyet Que T. Tran.</p>""", sizing_mode="stretch_both")
layout = column(heading, row(f1),sizing_mode='stretch_both',margin=(5,5,5,5))
show(layout)
The map shows the number of recorded crime cases by Buffalo council district. The centroid of each council district polygon is used to represent the total number of cases within that district. The higher the number, the larger the circle.
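Since each district total is drawn at the polygon's centroid, it may help to see how a centroid is obtained. The sketch below computes the area-weighted centroid of a simple polygon with the shoelace formula in plain Python. It is an illustration only, not the GeoPandas implementation. The function name and the test rectangle are hypothetical.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    area2 = 0.0  # twice the signed area
    cx = 0.0
    cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3 * area2), cy / (3 * area2)

# Hypothetical 4 x 2 rectangle standing in for a district outline.
rectangle = [(0, 0), (4, 0), (4, 2), (0, 2)]
print(polygon_centroid(rectangle))
```

For an irregular or concave district the centroid can fall away from where most incidents occur, or even outside the polygon, so the circle position is a summary marker rather than a crime location.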
In summary, Ellicott is the council district with the highest number of crime cases: 45,497. The South council district has the lowest number: 17,308. However, almost half of the South district's area is parks or other places without street addresses, where no crime cases are recorded in this dataset. We therefore cannot conclude that South is the safest council district in Buffalo.
Moreover, the Delaware council district has the second-lowest number of cases, 18,578, and again its area includes a big park.
Therefore, to judge how dangerous each council district is, we need to draw a point frequency map of all crime cases at duplicated locations.
# Comparing a Series with `!= None` is True for every row; notna() actually filters missing types
Allcases = allHl.loc[allHl.parent_incident_type.notna()].copy()
Allcases.to_crs('epsg:3857',inplace=True)
Allcases.tail()
maxcir = 60
maxcnt = Allcases.counts.max()
Allcases['radius']=(Allcases.counts/maxcnt*maxcir)
Allcases['radius']=Allcases['radius'].astype(float).round().astype(int)
Allcases.head()
Allcases.to_crs('epsg:3857',inplace=True)
f1 = figure(title = "Location of crime cases in Buffalo", tools=TOOLS, toolbar_sticky=False,**kwargs)
f1.add_tile(tileProvider)
f1.title.text_font_style = 'italic'
f1.title.text_font_size = '14pt'
f1.axis.visible=False
point_source_1 = GeoJSONDataSource(geojson=Allcases.to_json())
poly_source = GeoJSONDataSource(geojson=cd_gdf.to_json())
Circle1=f1.circle('x','y',size='radius',fill_color='blue',line_color='blue',fill_alpha=0.5,source=point_source_1)
areas = f1.patches('xs','ys',source=poly_source,name="Council Districts",fill_color=None,fill_alpha=0.6,line_color="red",line_width=0.9)
c_hover= HoverTool(renderers=[Circle1])
c_hover.point_policy = "follow_mouse"
c_hover.tooltips=[("Address","@address_1," "@council_district"),
(" " , " "),
("Number of Cases","@counts")]
f1.add_tools(c_hover)
heading = Div(text="""<h1>All Crimes Point Frequency Map</h1>\
<p> The map below shows the locations and frequencies of all crime cases in Buffalo by council district.</p>\
<p> Use the tools above the map to pan, zoom, etc. \
Hover over a point to see the address and number of cases.</p> \
<p><b><i>Data Source</i></b> = <a href='https://data.buffalony.gov' target='_blank'>Buffalo Open Data</a>.</p>\
<p style="font-size:9px;">Maps created 4/10/2022 by Nguyet Que T. Tran.</p>""", sizing_mode="stretch_both")
layout = column(heading, row(f1),sizing_mode='stretch_both',margin=(5,5,5,5))
show(layout)